Multiple Instance Metric Learning from Automatically Labeled Bags of Faces

نویسندگان

  • Matthieu Guillaumin
  • Jakob J. Verbeek
  • Cordelia Schmid
چکیده

Metric learning aims at finding a distance that approximates a task-specific notion of semantic similarity. Typically, a Mahalanobis distance is learned from pairs of data labeled as being semantically similar or not. In this paper, we learn such metrics in a weakly supervised setting where “bags” of instances are labeled with “bags” of labels. We formulate the problem as a multiple instance learning (MIL) problem over pairs of bags. If two bags share at least one label, we label the pair positive, and negative otherwise. We propose to learn a metric using those labeled pairs of bags, leading to MildML, for multiple instance logistic discriminant metric learning. MildML iterates between updates of the metric and selection of putative positive pairs of examples from positive pairs of bags. To evaluate our approach, we introduce a large and challenging data set, Labeled Yahoo! News, which we have manually annotated and contains 31147 detected faces of 5873 different people in 20071 images. We group the faces detected in an image into a bag, and group the names detected in the caption into a corresponding set of labels. When the labels come from manual annotation, we find that MildML using the bag-level annotation performs as well as fully supervised metric learning using instance-level annotation. We also consider performance in the case of automatically extracted labels for the bags, where some of the bag labels do not correspond to any example in the bag. In this case MildML works substantially better than relying on noisy instance-level annotations derived from the bag-level annotation by resolving face-name associations in images with their captions.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Review of Multi-Instance Learning and Its applications

Multiple Instance Learning (MIL) is proposed as a variation of supervised learning for problems with incomplete knowledge about labels of training examples. In supervised learning, every training instance is assigned with a discrete or real-valued label. In comparison, in MIL the labels are only assigned to bags of instances. In the binary case, a bag is labeled positive if at least one instanc...

متن کامل

Multiple Instance Learning with Manifold Bags

In many machine learning applications, labeling every instance of data is burdensome. Multiple Instance Learning (MIL), in which training data is provided in the form of labeled bags rather than labeled instances, is one approach for a more relaxed form of supervised learning. Though much progress has been made in analyzing MIL problems, existing work considers bags that have a finite number of...

متن کامل

Multiple Instance Learning with Query Bags

In many machine learning applications, precisely labeled data is either burdensome or impossible to collect. Multiple Instance Learning (MIL), in which training data is provided in the form of labeled bags rather than labeled instances, is one approach for dealing with ambiguously labeled data. In this paper we argue that in many applications of MIL (e.g. image, audio, text, bioinformatics) a s...

متن کامل

Multiple-Instance Active Learning

We present a framework for active learning in the multiple-instance (MI) setting. In an MI learning problem, instances are naturally organized into bags and it is the bags, instead of individual instances, that are labeled for training. MI learners assume that every instance in a bag labeled negative is actually negative, whereas at least one instance in a bag labeled positive is actually posit...

متن کامل

Convex Formulation of Multiple Instance Learning from Positive and Unlabeled Bags

Multiple instance learning (MIL) is a variation of traditional supervised learning problems where data (referred to as bags) are composed of sub-elements (referred to as instances) and only bag labels are available. MIL has a variety of applications such as content-based image retrieval, text categorization, and medical diagnosis. Most of the previous work for MIL assume that training bags are ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2010